GreekLex 2: A comprehensive lexical database with part-of-speech, syllabic, phonological, and stress information

نویسندگان

  • Antonios Kyparissiadis
  • Walter J B van Heuven
  • Nicola J Pitchford
  • Timothy Ledgeway
چکیده

Databases containing lexical properties on any given orthography are crucial for psycholinguistic research. In the last ten years, a number of lexical databases have been developed for Greek. However, these lack important part-of-speech information. Furthermore, the need for alternative procedures for calculating syllabic measurements and stress information, as well as combination of several metrics to investigate linguistic properties of the Greek language are highlighted. To address these issues, we present a new extensive lexical database of Modern Greek (GreekLex 2) with part-of-speech information for each word and accurate syllabification and orthographic information predictive of stress, as well as several measurements of word similarity and phonetic information. The addition of detailed statistical information about Greek part-of-speech, syllabification, and stress neighbourhood allowed novel analyses of stress distribution within different grammatical categories and syllabic lengths to be carried out. Results showed that the statistical preponderance of stress position on the pre-final syllable that is reported for Greek language is dependent upon grammatical category. Additionally, analyses showed that a proportion higher than 90% of the tokens in the database would be stressed correctly solely by relying on stress neighbourhood information. The database and the scripts for orthographic and phonological syllabification as well as phonetic transcription are available at http://www.psychology.nottingham.ac.uk/greeklex/.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A database design for a TTS synthesis system using lexical diphones

Database designs, if based on the premise that there are about 2000 diphones in English, as stated in many publications and on-line documents, are likely to render a database of diphones, which will fail to capture some important phonological phenomena of English. This paper proposes a TTS database, which is built from diphones inclusive of their syllabic stress; we term these units lexical dip...

متن کامل

Relationship between rapid naming, phonological awareness and spelling in Farsi normally developing children

Introduction: The aim of the present study was to investigate relationship among phonological awareness, rapid naming and spelling in normally developing Farsi speaking children. Materials and Methods: In this description- analytical cross sectional study 30 normally developing students randomly selected from Tehran (Iran). The students performed Nama reading and dyslexia, rapid automatized nam...

متن کامل

Early Phonological and Lexical Development of a Farsi Speaking Child: A Longitudinal Case Study

The present study aims at the description and analysis of the phonological and lexical development of a child who is acquiring Farsi as his first language. The child's language production at the holophrastic stage of language development, mainly single words, is observed and recorded  longitudinally for nearly seven  months since he was 16 months old until he turned 23 months. An attempt is mad...

متن کامل

Design and Implementation of an Intelligent Part of Speech Generator

The aim of this paper is to report on an attempt to design and implement an intelligent system capable of generating the correct part of speech for a given sentence while the sentence is totally new to the system and not stored in any database available to the system. It follows the same steps a normal individual does to provide the correct parts of speech using a natural language processor. It...

متن کامل

The WEAVER model of word-form encoding in speech production.

Lexical access in speaking consists of two major steps: lemma retrieval and word-form encoding. In Roelofs (Roelofs, A. 1992a. Cognition 42. 107-142; Roelofs. A. 1993. Cognition 47, 59-87.), I described a model of lemma retrieval. The present paper extends this work by presenting a comprehensive model of the second access step, word-form encoding. The model is called WEAVER (Word-form Encoding ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2017